Multivariate Temporal Regression at Scale: A Three-Pillar Framework Combining ML, XAI, and NLP

Francis, Jiztom Kavalakkatt, Darr, Matthew J

arXiv.org Artificial Intelligence

This paper introduces a novel framework that accelerates the discovery of actionable relationships in high-dimensional temporal data by integrating machine learning (ML), explainable AI (XAI), and natural language processing (NLP) to enhance data quality and streamline workflows. Traditional methods often fail to recognize complex temporal relationships, leading to noisy, redundant, or biased datasets. Our approach combines ML-driven pruning to identify and mitigate low-quality samples, XAI-based interpretability to validate critical feature interactions, and NLP for future contextual validation, reducing the time required to uncover actionable insights by 40-60%. Evaluated on real-world agricultural and synthetic datasets, the framework significantly improves performance metrics (e.g., MSE, R2, MAE) and computational efficiency, with hardware-agnostic scalability across diverse platforms. While long-term real-world impacts (e.g., cost savings, sustainability gains) are pending, this methodology provides an immediate pathway to accelerate data-centric AI in dynamic domains like agriculture and energy, enabling faster iteration cycles for domain experts.
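The ML-driven pruning pillar described above can be illustrated with a minimal sketch: fit a cheap reference model, score each sample by its residual, and drop the worst-scoring fraction as likely low-quality. The function name `prune_low_quality_samples` and the residual-based score are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def prune_low_quality_samples(X, y, keep_frac=0.8):
    """Illustrative sample pruning (not the paper's method): fit a
    simple linear model, then drop the samples with the largest
    residuals, treating them as noisy or low-quality."""
    # Fit ordinary least squares on the full data.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = np.abs(X @ w - y)
    # Keep the keep_frac fraction of samples with the smallest residuals.
    n_keep = int(len(y) * keep_frac)
    keep = np.argsort(residuals)[:n_keep]
    return X[keep], y[keep]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)
y[:20] += rng.normal(scale=5.0, size=20)   # inject some noisy samples
X_clean, y_clean = prune_low_quality_samples(X, y)
```

In a temporal setting, the reference model and residual score would be replaced by something sequence-aware; the pruning loop itself stays the same.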


All-in-One Tuning and Structural Pruning for Domain-Specific LLMs

Lu, Lei, Wang, Zhepeng, Bao, Runxue, Wang, Mengbing, Li, Fangyi, Wu, Yawen, Jiang, Weiwen, Xu, Jie, Wang, Yanzhi, Gao, Shangqian

arXiv.org Artificial Intelligence

Existing pruning techniques for large language models (LLMs) targeting domain-specific applications typically follow a two-stage process: pruning the pretrained general-purpose LLMs and then fine-tuning the pruned LLMs on specific domains. However, the pruning decisions, derived from the pretrained weights, remain unchanged during fine-tuning, even if the weights have been updated. Therefore, such a combination of the pruning decisions and the fine-tuned weights may be suboptimal, leading to non-negligible performance degradation. To address these limitations, we propose ATP: All-in-One Tuning and Structural Pruning, a unified one-stage structural pruning and fine-tuning approach that dynamically identifies the current optimal substructure throughout the fine-tuning phase via a trainable pruning decision generator. Moreover, given the limited available data for domain-specific applications, Low-Rank Adaptation (LoRA) becomes a common technique to fine-tune the LLMs. In ATP, we introduce LoRA-aware forward and sparsity regularization to ensure that the substructures corresponding to the learned pruning decisions can be directly removed after the ATP process. ATP outperforms the state-of-the-art two-stage pruning methods on tasks in the legal and healthcare domains. More specifically, ATP recovers up to 88% and 91% performance of the dense model when pruning 40% parameters of LLaMA2-7B and LLaMA3-8B models, respectively.
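The core idea of a LoRA-aware forward pass with learnable structural pruning decisions can be sketched in a few lines. This is a simplified NumPy illustration, not ATP itself: the per-channel decision logits, the sigmoid relaxation, and the single-layer setup are all assumptions made for the example.

```python
import numpy as np

def lora_aware_pruned_forward(x, W, A, B, decision_logits):
    """Sketch of a LoRA-aware forward pass with soft structural pruning
    decisions (one decision per output channel).

    W : frozen base weight, shape (d_out, d_in)
    A, B : LoRA factors, so the adapted weight is W + B @ A
    decision_logits : trainable logits; a sigmoid gives a soft keep/prune
    decision per output channel, applied to the *adapted* weight so the
    pruning decision tracks the fine-tuned weights."""
    m = 1.0 / (1.0 + np.exp(-decision_logits))   # soft decisions in (0, 1)
    W_adapted = W + B @ A                        # LoRA-updated weight
    return (m[:, None] * W_adapted) @ x          # mask whole rows (channels)

rng = np.random.default_rng(1)
d_in, d_out, r = 8, 4, 2
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in))
B = rng.normal(size=(d_out, r))
x = rng.normal(size=d_in)
logits = np.array([20.0, -20.0, 20.0, -20.0])   # channels 1 and 3 ~pruned
y = lora_aware_pruned_forward(x, W, A, B, logits)
```

After training, channels whose decisions saturate near zero can be physically removed, which is what the paper's sparsity regularization is designed to guarantee.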


Decay Pruning Method: Smooth Pruning With a Self-Rectifying Procedure

Yang, Minghao, Gao, Linlin, Li, Pengyuan, Li, Wenbo, Dong, Yihong, Cui, Zhiying

arXiv.org Artificial Intelligence

Deep Neural Networks (DNNs) have been widely used for various applications, such as image classification [22; 40], object segmentation [33; 35], and object detection [6; 43]. However, the increasing size and complexity of DNNs often result in substantial computational and memory requirements, posing challenges for deployment on resource-constrained platforms, such as mobile or embedded devices. Consequently, developing efficient methods to reduce the computational complexity and storage demands of large models, while minimizing performance degradation, has become essential. Network pruning is one of the most popular methods in model compression. Specifically, current network pruning methods are categorized into unstructured and structured pruning [5]. Unstructured pruning [11; 24] focuses on eliminating individual weights from a network to create fine-grained sparsity. Although these approaches achieve an excellent balance between model size reduction and accuracy retention, they often require specific hardware support for acceleration, which is impractical for general-purpose computing environments. Conversely, structured pruning [23; 18; 29] avoids these hardware dependencies by eliminating redundant network structures, thus introducing a more manageable and hardware-compatible form of sparsity. As a result, structured pruning has become popular and is extensively utilized.
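The unstructured/structured distinction drawn above can be made concrete with a toy example: unstructured pruning zeroes individual small-magnitude weights (leaving the matrix the same shape, hence the need for sparse-hardware support), while structured pruning removes whole rows so the matrix genuinely shrinks. This is a generic magnitude-based sketch, not any specific paper's criterion:

```python
import numpy as np

def unstructured_prune(W, sparsity):
    """Zero out the smallest-magnitude individual weights (fine-grained
    sparsity; the matrix keeps its shape)."""
    k = int(W.size * sparsity)
    thresh = np.sort(np.abs(W), axis=None)[k]   # k-th smallest magnitude
    return np.where(np.abs(W) >= thresh, W, 0.0)

def structured_prune(W, n_remove):
    """Remove whole output channels (rows) with the smallest L2 norm,
    shrinking the matrix so no special sparse hardware is needed."""
    norms = np.linalg.norm(W, axis=1)
    keep = np.sort(np.argsort(norms)[n_remove:])
    return W[keep]

rng = np.random.default_rng(2)
W = rng.normal(size=(8, 16))
W_sparse = unstructured_prune(W, sparsity=0.5)   # same shape, half zeros
W_small = structured_prune(W, n_remove=2)        # two rows gone
```

The dense `W_small` runs at full speed on any hardware, which is why structured pruning has become the default for general-purpose deployment.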


Combining Relevance and Magnitude for Resource-Aware DNN Pruning

Chiasserini, Carla Fabiana, Malandrino, Francesco, Molner, Nuria, Zhao, Zhiqiang

arXiv.org Artificial Intelligence

Pruning neural networks, i.e., removing some of their parameters whilst retaining their accuracy, is one of the main ways to reduce the latency of a machine learning pipeline, especially in resource- and/or bandwidth-constrained scenarios. In this context, the pruning technique, i.e., how to choose the parameters to remove, is critical to the system performance. In this paper, we propose a novel pruning approach, called FlexRel and predicated upon combining training-time and inference-time information, namely, parameter magnitude and relevance, in order to improve the resulting accuracy whilst saving both computational resources and bandwidth. Our performance evaluation shows that FlexRel is able to achieve higher pruning factors, saving over 35% bandwidth for typical accuracy targets.
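A combined magnitude-and-relevance score like the one FlexRel is predicated upon can be sketched generically. The abstract does not give the exact combination rule, so the normalization, the linear mix, and the `alpha` weight below are all assumptions for illustration:

```python
import numpy as np

def combined_pruning_scores(weights, relevance, alpha=0.5):
    """Generic sketch of a magnitude+relevance pruning score (the exact
    FlexRel rule is not specified in the abstract; alpha is a made-up
    mixing weight). Parameters with the lowest scores are pruned first."""
    def normalize(v):
        v = np.abs(v)
        return v / (v.max() + 1e-12)
    return alpha * normalize(weights) + (1 - alpha) * normalize(relevance)

w = np.array([0.9, 0.01, 0.5, 0.02])      # training-time magnitudes
rel = np.array([0.1, 0.8, 0.05, 0.02])    # inference-time relevance
scores = combined_pruning_scores(w, rel)
prune_order = np.argsort(scores)           # indices pruned first
```

Note how parameter 1 (tiny magnitude but high relevance) is protected relative to parameter 3, which is small on both criteria; a pure magnitude rule would treat them almost identically.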


Uncovering implementable dormant pruning decisions from three different stakeholder perspectives

Flynn, Deanna, Jain, Abhinav, Knight, Heather, Wilson, Cristina G., Grimm, Cindy

arXiv.org Artificial Intelligence

Dormant pruning, or the removal of unproductive portions of a tree while a tree is not actively growing, is an important orchard task to help maintain yield, requiring years to build expertise. Because of long training periods and an increasing labor shortage in agricultural jobs, pruning could benefit from robotic automation. However, to program robots to prune branches, we first need to understand how pruning decisions are made, and what variables in the environment (e.g., branch size and thickness) we need to capture. Working directly with three pruning stakeholders -- horticulturists, growers, and pruners -- we find that each group of human experts approaches pruning decision-making differently. To capture this knowledge, we present three studies and two extracted pruning protocols from field work conducted in Prosser, Washington in January 2022 and 2023. We interviewed six stakeholders (two in each group) and observed pruning across three cultivars -- Bing Cherries, Envy Apples, and Jazz Apples -- and two tree architectures -- Upright Fruiting Offshoot and V-Trellis. Leveraging participant interviews and video data, this analysis uses grounded coding to extract pruning terminology, discover horticultural contexts that influence pruning decisions, and find implementable pruning heuristics for autonomous systems. The results include a validated terminology set, which we offer for use by both pruning stakeholders and roboticists, to communicate general pruning concepts and heuristics. The results also highlight seven pruning heuristics utilizing this terminology set that would be relevant for use by future autonomous robot pruning systems, and characterize three discovered horticultural contexts (i.e., environmental management, crop-load management, and replacement wood) across all three cultivars.


PDP: Parameter-free Differentiable Pruning is All You Need

Cho, Minsik, Adya, Saurabh, Naik, Devang

arXiv.org Artificial Intelligence

DNN pruning is a popular way to reduce the size of a model, improve the inference latency, and minimize the power consumption on DNN accelerators. However, existing approaches might be too complex, expensive or ineffective to apply to a variety of vision/language tasks, DNN architectures and to honor structured pruning constraints. In this paper, we propose an efficient yet effective train-time pruning scheme, Parameter-free Differentiable Pruning (PDP), which offers state-of-the-art qualities in model size, accuracy, and training cost. PDP uses a dynamic function of weights during training to generate soft pruning masks for the weights in a parameter-free manner for a given pruning target. While differentiable, the simplicity and efficiency of PDP make it universal enough to deliver state-of-the-art random/structured/channel pruning results on various vision and natural language tasks. For example, for MobileNet-v1, PDP can achieve 68.2% top-1 ImageNet1k accuracy at 86.6% sparsity, which is 1.7% higher accuracy than those from the state-of-the-art algorithms. Also, PDP yields over 83.1% accuracy on Multi-Genre Natural Language Inference with 90% sparsity for BERT, while the next best from the existing techniques shows 81.5% accuracy. In addition, PDP can be applied to structured pruning, such as N:M pruning and channel pruning. For 1:4 structured pruning of ResNet18, PDP improved the top-1 ImageNet1k accuracy by over 3.6% over the state-of-the-art. For channel pruning of ResNet50, PDP reduced the top-1 ImageNet1k accuracy by 0.6% from the state-of-the-art.
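The "parameter-free" idea, a soft pruning mask derived purely from the current weights rather than from extra trainable mask variables, can be sketched as follows. The quantile threshold and sigmoid relaxation are in the spirit of PDP but are illustrative details, not the paper's exact formulation:

```python
import numpy as np

def pdp_style_soft_mask(W, target_sparsity, temperature=0.01):
    """Sketch in the spirit of PDP: derive a soft, differentiable pruning
    mask purely from the current weights (no extra trainable mask
    parameters). The threshold is the magnitude quantile matching the
    target sparsity; the sigmoid keeps the mask differentiable.
    Illustrative only, not the paper's exact scheme."""
    t = np.quantile(np.abs(W), target_sparsity)   # data-driven threshold
    return 1.0 / (1.0 + np.exp(-(np.abs(W) - t) / temperature))

rng = np.random.default_rng(3)
W = rng.normal(size=(64,))
mask = pdp_style_soft_mask(W, target_sparsity=0.9)
masked_W = mask * W   # ~90% of weights are effectively suppressed
```

Because the threshold is recomputed from the weights at every step, the mask adapts as training moves weights around, with no mask parameters to store or tune.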


Relevant Entity Selection: Knowledge Graph Bootstrapping via Zero-Shot Analogical Pruning

Jarnac, Lucas, Couceiro, Miguel, Monnin, Pierre

arXiv.org Artificial Intelligence

Knowledge Graph Construction (KGC) can be seen as an iterative process starting from a high quality nucleus that is refined by knowledge extraction approaches in a virtuous loop. Such a nucleus can be obtained from knowledge existing in an open KG like Wikidata. However, due to the size of such generic KGs, integrating them as a whole may entail irrelevant content and scalability issues. We propose an analogy-based approach that starts from seed entities of interest in a generic KG, and keeps or prunes their neighboring entities. We evaluate our approach on Wikidata through two manually labeled datasets that contain either domain-homogeneous or -heterogeneous seed entities. We empirically show that our analogy-based approach outperforms LSTM, Random Forest, SVM, and MLP, with a drastically lower number of parameters. We also evaluate its generalization potential in a transfer learning setting. These results advocate for the further integration of analogy-based inference in tasks related to the KG lifecycle.


Safe Peeling for L0-Regularized Least-Squares with supplementary material

Guyard, Théo, Monnoyer, Gilles, Elvira, Clément, Herzet, Cédric

arXiv.org Artificial Intelligence

Abstract--We introduce a new methodology dubbed "safe peeling" to accelerate the exact resolution of the so-called "l0-regularized least-squares" problem, denoted (1-P). This problem is NP-hard [4, Th. 1], which has motivated a flurry of contributions proposing tractable procedures able to recover approximate solutions of (1-P). A standard approach to solving (1-P) exactly is a Branch-and-Bound (BnB) algorithm, see [6-11]; the crux of BnB methods consists in identifying and discarding (pruning) intervals of the search space that cannot contain an optimal solution. Our contribution is a computationally simple test, applied at each node of the BnB decision tree, that tightens the box constraint x in [l, u] (to be understood component-wise) while provably preserving the pruning decisions. In the literature [7, 9, 13], such additional constraints usually take the form of a bound with M > 0 on the magnitude of x, known as a "Big-M" constraint [7]. The proposed peeling procedure is presented in Sec. IV and its performance is illustrated in Sec. V; proofs of the presented results are deferred to the appendices.
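The node-pruning mechanism at the heart of any BnB solver, the step that safe peeling is designed to accelerate, can be illustrated on a toy problem: a node (here, an interval) is discarded whenever its optimistic lower bound cannot beat the best feasible value found so far. This is a generic interval BnB on a 1-D function, purely illustrative and unrelated to the paper's actual algorithm for (1-P):

```python
def bnb_minimize(f_lower, f_value, l, u, tol=1e-6):
    """Toy branch-and-bound: repeatedly split intervals and discard
    ('prune') any interval whose lower bound cannot improve on the best
    value found so far. Generic illustration of the BnB pruning test;
    not the paper's method."""
    best_x, best_val = l, f_value(l)
    stack = [(l, u)]
    while stack:
        a, b = stack.pop()
        # Pruning test: skip regions that provably hold no improvement.
        if f_lower(a, b) >= best_val:
            continue
        mid = 0.5 * (a + b)
        if f_value(mid) < best_val:
            best_x, best_val = mid, f_value(mid)
        if b - a > tol:
            stack.extend([(a, mid), (mid, b)])
    return best_x, best_val

# Minimize (x - 2)^2 over [-10, 10].
f = lambda x: (x - 2.0) ** 2
f_low = lambda a, b: 0.0 if a <= 2.0 <= b else min(f(a), f(b))
x_star, v_star = bnb_minimize(f_low, f, -10.0, 10.0)
```

Almost every interval is eliminated by the pruning test rather than subdivided to full depth; peeling-style techniques push this further by shrinking the surviving intervals themselves before the test is applied.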